Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tabular Q solution example #125

Merged
merged 17 commits into from
Mar 20, 2021

Conversation

JD-ETH
Copy link
Contributor

@JD-ETH JD-ETH commented Mar 10, 2021

@ChrisCummins
A working example with tabular Q-learning.

As this is my first real PR please be harsh about formatting, logging, commenting, styling, and point me to guidelines that I have missed. Thanks!

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 10, 2021
Copy link
Contributor

@ChrisCummins ChrisCummins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really nice @JD-ETH, thanks a lot! Having a value-iteration baseline is a very welcome addition.

In general the implementation is solid, but since you asked, I have gone through and left a very pedantic review 😃 Don't worry, this level of scrutiny is not normal! Most things are small nitpicks or bikeshedding. The most important change I'd love to see is to bring the level of commenting up to the point of this being a reasonably-standalone explanation of the technique for a general non-expert audience.

Leave a comment here once you want me to take another look.

Cheers,
Chris

examples/brute_force.py Outdated Show resolved Hide resolved
examples/brute_force.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/tabular_q.py Outdated Show resolved Hide resolved
examples/brute_force.py Outdated Show resolved Hide resolved
examples/brute_force.py Show resolved Hide resolved
@ChrisCummins
Copy link
Contributor

ChrisCummins commented Mar 11, 2021

Oo one thing I forgot - please add a test 😊

Given that this isn't core API, a simple smoke test using flags for quick run like --episodes=5 would be sufficient. Take a look at the actor critic smoke test as an example: https://github.com/facebookresearch/CompilerGym/blob/development/examples/actor_critic_test.py

You will then need to add bazel definitions so that the test is run on make test: https://github.com/facebookresearch/CompilerGym/blob/development/examples/BUILD#L8-L25

EDIT: Also worth noting that this test will fail until this PR is rebased on #126.

@JD-ETH
Copy link
Contributor Author

JD-ETH commented Mar 16, 2021

@ChrisCummins I'm ready for the next round.

Copy link
Contributor

@ChrisCummins ChrisCummins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JD-ETH, thanks a lot for making those changes, I really appreciate the extra docs and the test!

This LGTM and feel free to merge it when you're ready. You may want to squash the history before merging into smaller atomic commits, perhaps one for the brute_force fix and the other for the tabulary_q, but that's not essential so I'll leave it up to your discretion :) Thanks for adding this new baseline!

Cheers,
Chris

examples/tabular_q.py Outdated Show resolved Hide resolved
Co-authored-by: Chris Cummins <chrisc.101@gmail.com>
@JD-ETH JD-ETH merged commit 6b9dbd1 into facebookresearch:development Mar 20, 2021
JD-ETH added a commit that referenced this pull request Mar 20, 2021
@JD-ETH
Copy link
Contributor Author

JD-ETH commented Mar 20, 2021

@ChrisCummins Now i see CI is failing from this PR, sorry for merging it before the CI tests finished. I am not sure I understand what failed in the test though, would you give some pointers?

@ChrisCummins
Copy link
Contributor

@ChrisCummins Now i see CI is failing from this PR, sorry for merging it before the CI tests finished. I am not sure I understand what failed in the test though, would you give some pointers?

Thanks for checking. It's a false positive, nothing to worry about :). See #144.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants